Hannes Datta
Using virtual environment '/Users/hannesdatta/.virtualenvs/r-reticulate' ...
Using virtual environment '/Users/hannesdatta/.virtualenvs/r-reticulate' ...
We're about to start with the first lecture of this class.
If you haven't done so, please
Substantive interests
Methodological interests
With web scraping, you can capture anything you can view in a web browser
With APIs, you obtain official data from a firm in a programmatic way
import requests
url = 'https://music-to-scrape.org/'
webrequest = requests.get(url)
from bs4 import BeautifulSoup
soup = BeautifulSoup(webrequest.text)
weekly15 = soup.find('section', {'name':'weekly_15'})
for song in weekly15.find_all('h5'): print(song.text)
The Smashing Pumpkins
Robert Lockwood_ Jr.
Stevie Ray Vaughan And Double Trouble
Joi
Vangelis
Xzibit featuring Jayo Felony and Method Man
Tad
Brian Eno And David Byrne
Mongo Santamaria
Hatebreed
Chelsea
Liars
Lili Ivanova
Sonny Terry & Brownie McGhee
Snow Patrol
APIs are official interfaces by firms for programmers to extract or submit data, or obtain access to an algorithm
They work like websites (i.e., you can call them with the same snippets as before), but usually you need to pay or at least sign up for the service
# let's get some data from the API of music-to-scrape
api_request = requests.get('https://api.music-to-scrape.org/charts/top-tracks')
processing file: slides.Rpres
Quitting from lines 225-227 [unnamed-chunk-5] (slides.Rpres)
Execution halted